AITopics | low data regime

Collaborating Authors

low data regime

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Uncertainty Quantification and Deep Ensembles

Neural Information Processing SystemsDec-24-2025, 16:23:06 GMT

Deep Learning methods are known to suffer from calibration issues: they typically produce over-confident estimates. These problems are exacerbated in the low data regime. Although the calibration of probabilistic models is well studied, calibrating extremely over-parametrized models in the low-data regime presents unique challenges. We show that deep-ensembles do not necessarily lead to improved calibration properties. In fact, we show that standard ensembling methods, when used in conjunction with modern techniques such as mixup regularization, can lead to less calibrated models. This text examines the interplay between three of the most simple and commonly used approaches to leverage deep learning when data is scarce: data-augmentation, ensembling, and post-processing calibration methods. We demonstrate that, although standard ensembling techniques certainly help to boost accuracy, the calibration of deep ensembles relies on subtle trade-offs. We also find that calibration methods such as temperature scaling need to be slightly tweaked when used with deep-ensembles and, crucially, need to be executed after the averaging process. Our simulations indicate that, in the low data regime, this simple strategy can halve the Expected Calibration Error (ECE) on a range of benchmark classification problems when compared to standard deep-ensembles.

electronic proceedings, name change, uncertainty quantification and deep ensemble, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

Add feedback

Personalized Federated Learning With Gaussian Processes

Neural Information Processing SystemsDec-24-2025, 01:41:30 GMT

Federated learning aims to learn a global model that performs well on client devices with limited cross-client communication. Personalized federated learning (PFL) further extends this setup to handle data heterogeneity between clients by learning personalized models. A key challenge in this setting is to learn effectively across clients even though each client has unique data that is often limited in size. Here we present pFedGP, a solution to PFL that is based on Gaussian processes (GPs) with deep kernel learning. GPs are highly expressive models that work well in the low data regime due to their Bayesian nature.However, applying GPs to PFL raises multiple challenges.

gaussian process, name change, personalized federated learning, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Predicting protein-ligand binding affinity is an essential part of computer-aided drug design. However, generalisable and performant global binding affinity models remain elusive, particularly in low data regimes. Despite the evolution of model architectures, current benchmarks are not well-suited to probe the generalisability of 3D binding affinity models. Furthermore, 3D global architectures such as GNNs have not lived up to performance expectations. To investigate these issues, we introduce a novel split of the PDBBind dataset, minimizing similarity leakage between train and test sets and allowing for a fair and direct comparison between various model architectures. On this low similarity split, we demonstrate that, in general, 3D global models are superior to protein-specific local models in low data regimes. We also demonstrate that the performance of GNNs benefits from three novel contributions: supervised pre-training via quantum mechanical data, unsupervised pre-training via small molecule diffusion, and explicitly modeling hydrogen atoms in the input graph. We believe that this work introduces promising new approaches to unlock the potential of GNN architectures for binding affinity modelling.

dataset, low data regime, protein, (12 more...)

arXiv.org Artificial Intelligence

2409.12995

Genre: Research Report (0.65)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

On the calibration of compartmental epidemiological models

Gupta, Nikunj, Mai, Anh, Abouzied, Azza, Shasha, Dennis

arXiv.org Artificial IntelligenceDec-8-2023

Epidemiological compartmental models are useful for understanding infectious disease propagation and directing public health policy decisions. Calibration of these models is an important step in offering accurate forecasts of disease dynamics and the effectiveness of interventions. In this study, we present an overview of calibrating strategies that can be employed, including several optimization methods and reinforcement learning (RL). We discuss the benefits and drawbacks of these methods and highlight relevant practical conclusions from our experiments. Optimization methods iteratively adjust the parameters of the model until the model output matches the available data, whereas RL uses trial and error to learn the optimal set of parameters by maximizing a reward signal. Finally, we discuss how the calibration of parameters of epidemiological compartmental models is an emerging field that has the potential to improve the accuracy of disease modeling and public health decision-making. Further research is needed to validate the effectiveness and scalability of these approaches in different epidemiological contexts. All codes and resources are available on https://github.com/Nikunj-Gupta/

compartmental model, data regime, subgroup, (14 more...)

arXiv.org Artificial Intelligence

2312.05456

Country:

North America > United States > New York (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > Tennessee (0.04)
Europe > United Kingdom (0.04)

Genre:

Overview (0.54)
Research Report > New Finding (0.34)

Industry:

Health & Medicine > Public Health (1.00)
Health & Medicine > Epidemiology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

A Comprehensive Study of Modern Architectures and Regularization Approaches on CheXpert5000

Ihler, Sontje, Kuhnke, Felix, Spindeldreier, Svenja

arXiv.org Artificial IntelligenceFeb-13-2023

Computer aided diagnosis (CAD) has gained an increased amount of attention in the general research community over the last years as an example of a typical limited data application - with experiments on labeled 100k-200k datasets. Although these datasets are still small compared to natural image datasets like ImageNet1k, ImageNet21k and JFT, they are large for annotated medical datasets, where 1k-10k labeled samples are much more common. There is no baseline on which methods to build on in the low data regime. In this work we bridge this gap by providing an extensive study on medical image classification with limited annotations (5k). We present a study of modern architectures applied to a fixed low data regime of 5000 images on the CheXpert dataset. Conclusively we find that models pretrained on ImageNet21k achieve a higher AUC and larger models require less training steps. All models are quite well calibrated even though we only fine-tuned on 5000 training samples. All 'modern' architectures have higher AUC than ResNet50. Regularization of Big Transfer Models with MixUp or Mean Teacher improves calibration, MixUp also improves accuracy. Vision Transformer achieve comparable or on par results to Big Transfer Models.

artificial intelligence, deep learning, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2302.06684

Genre: Research Report (0.82)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Partial annotations for the segmentation of large structures with low annotation cost

Fadida, Bella Specktor, Sourani, Daphna Link, Miller, Liat Ben Sira Elka, Bashat, Dafna Ben, Joskowicz, Leo

arXiv.org Artificial IntelligenceSep-25-2022

Deep learning methods have been shown to be effective for the automatic segmentation of structures and pathologies in medical imaging. However, they require large annotated datasets, whose manual segmentation is a tedious and time-consuming task, especially for large structures. We present a new method of partial annotations that uses a small set of consecutive annotated slices from each scan with an annotation effort that is equal to that of only few annotated cases. The training with partial annotations is performed by using only annotated blocks, incorporating information about slices outside the structure of interest and modifying a batch loss function to consider only the annotated slices. To facilitate training in a low data regime, we use a two-step optimization process. We tested the method with the popular soft Dice loss for the fetal body segmentation task in two MRI sequences, TRUFI and FIESTA, and compared full annotation regime to partial annotations with a similar annotation effort. For TRUFI data, the use of partial annotations yielded slightly better performance on average compared to full annotations with an increase in Dice score from 0.936 to 0.942, and a substantial decrease in Standard Deviations (STD) of Dice score by 22% and Average Symmetric Surface Distance (ASSD) by 15%. For the FIESTA sequence, partial annotations also yielded a decrease in STD of the Dice score and ASSD metrics by 27.5% and 33% respectively for in-distribution data, and a substantial improvement also in average performance on out-of-distribution data, increasing Dice score from 0.84 to 0.9 and decreasing ASSD from 7.46 to 4.01 mm. The two-step optimization process was helpful for partial annotations for both in-distribution and out-of-distribution data. The partial annotations method with the two-step optimizer is therefore recommended to improve segmentation performance under low data regime.

annotation, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-16760-7_2

2209.12216

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.05)
Europe > Switzerland (0.04)
North America > Canada > Ontario > National Capital Region > Ottawa (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Adding more data does not always help: A study in medical conversation summarization with PEGASUS

Nair, Varun, Katariya, Namit, Amatriain, Xavier, Valmianski, Ilya, Kannan, Anitha

arXiv.org Artificial IntelligenceNov-28-2021

Medical conversation summarization is integral in capturing information gathered during interactions between patients and physicians. Summarized conversations are used to facilitate patient hand-offs between physicians, and as part of providing care in the future. Summaries, however, can be time-consuming to produce and require domain expertise. Modern pre-trained NLP models such as PEGASUS have emerged as capable alternatives to human summarization, reaching state-of-the-art performance on many summarization benchmarks. However, many downstream tasks still require at least moderately sized datasets to achieve satisfactory performance. In this work we (1) explore the effect of dataset size on transfer learning medical conversation summarization using PEGASUS and (2) evaluate various iterative labeling strategies in the low-data regime, following their success in the classification setting. We find that model performance saturates with increase in dataset size and that the various active-learning strategies evaluated all show equivalent performance consistent with simple dataset size increase. We also find that naive iterative pseudo-labeling is on-par or slightly worse than no pseudo-labeling. Our work sheds light on the successes and challenges of translating low-data regime techniques in classification to medical conversation summarization and helps guides future work in this space. Relevant code available at \url{https://github.com/curai/curai-research/tree/main/medical-summarization-ML4H-2021}.

conversation summarization, medical conversation summarization, summarization, (15 more...)

arXiv.org Artificial Intelligence

2111.07564

Country: Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre: Research Report (0.51)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Data augmentation and pre-trained networks for extremely low data regimes unsupervised visual inspection

Gutierrez, Pierre, Cordier, Antoine, Caldeira, Thaïs, Sautory, Théophile

arXiv.org Machine LearningJun-2-2021

The use of deep features coming from pre-trained neural networks for unsupervised anomaly detection purposes has recently gathered momentum in the computer vision field. In particular, industrial inspection applications can take advantage of such features, as demonstrated by the multiple successes of related methods on the MVTec Anomaly Detection (MVTec AD) dataset. These methods make use of neural networks pre-trained on auxiliary classification tasks such as ImageNet. However, to our knowledge, no comparative study of robustness to the low data regimes between these approaches has been conducted yet. For quality inspection applications, the handling of limited sample sizes may be crucial as large quantities of images are not available for small series. In this work, we aim to compare three approaches based on deep pre-trained features when varying the quantity of available data in MVTec AD: KNN, Mahalanobis, and PaDiM. We show that although these methods are mostly robust to small sample sizes, they still can benefit greatly from using data augmentation in the original image space, which allows to deal with very small production runs.

anomaly detection, arxiv preprint arxiv, data augmentation, (11 more...)

arXiv.org Machine Learning

2106.01277

Country:

South America > Brazil > São Paulo > Campinas (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback